Patterns are specified as strings. The default syntax is Emacs-style.
Special Characters (using default syntax):
. matches any character
* 0 or more of preceding regular expression
+ 1 or more of preceding regular expression
? 0 or 1 of preceding regular expression
[ ] defines character set: '[a-zA-Z]' to match all letters
[^ ] defines complemented character set: matches if char is NOT in set
^ matches empty str at beginning of line
$ matches empty str at end of line
\ quoting char: \[ matches char '['
\\ matches '\'; due to Python string rules, write as '\\\\' in the pattern string.
\| specifies alternative: 'foo\|bar' matches 'foo' or 'bar'
\( \) grouping (for \|, or complicated expr, or substr for future reference by \D character or group() method)
\D D is digit: matches substr matched by D'th \( \) in pattern
\` empty str at beginning of file
\' empty str at end of file
\b empty str at beginning or end of word: '\bis\b' matches 'is', but not 'his'
\B empty str NOT at beginning or end of word
\< empty str at beginning of word
\> empty str at end of word
\w any word constituent
\W any non-word constituent
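The sketch below illustrates a few entries from the table above. It uses the standard re module rather than this module, on the assumption that this module may not be importable in the interpreter at hand. In re, grouping is written ( ) and alternation | instead of \( \) and \|, and there are no \< or \> operators; the other constructs shown carry over unchanged. The sample strings are made up for illustration.

```python
import re

# [ ] and [^ ]: a character set and its complement.
letters = re.findall(r'[a-zA-Z]+', 'ab12cd')        # runs of letters
non_letters = re.findall(r'[^a-zA-Z]+', 'ab12cd')   # runs of anything else

# Alternation: Emacs-style 'foo\|bar' is written 'foo|bar' in re.
alternatives = re.findall(r'foo|bar', 'foo bar baz')

# Back-reference: Emacs-style '\(a*\)b\1' is written '(a*)b\1' in re.
backref = re.match(r'(a*)b\1', 'aabaa')

# Word boundaries: '\bis\b' matches the word 'is' but not the 'is' in 'his'.
word_hit = re.search(r'\bis\b', 'this is it')
word_miss = re.search(r'\bis\b', 'his')

# A literal backslash: the regexp \\ is written '\\\\' in an ordinary string.
backslash = re.search('\\\\', 'a\\b')
```

Writing patterns as raw strings (r'...') avoids doubling backslashes; the last line deliberately uses an ordinary string to show the doubling described above.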
Variables:
error -- Exception raised when the pattern string isn't a valid regexp.
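A minimal sketch of handling an invalid pattern, again using the standard re module as a stand-in (an assumption, not this module's own API). There the corresponding exception is re.error; the helper name is_valid_pattern is hypothetical.

```python
import re

def is_valid_pattern(pattern):
    """Return True if re can compile the pattern, False otherwise."""
    try:
        re.compile(pattern)
    except re.error:
        # Raised for malformed patterns such as an unterminated set or group.
        return False
    return True

unterminated_set = is_valid_pattern('[a-z')
unterminated_group = is_valid_pattern('(ab')
well_formed = is_valid_pattern('[a-z]')
```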
Functions:
match(pattern, string) -- Return how many characters at the beginning of <string> match regexp <pattern> or -1 if none.
search(pattern, string [, pos]) -- Return the first position in <string>, starting at <pos> if given, that matches regexp <pattern>. Return -1 if none.
compile(pattern [, translate]) -- Create regexp object that has methods match() and search() working as above. Also group(i1 [, i2]*). Also regs, tuple of positions matched; regs[0] is whole match, next are subexpressions. E.g.
    p = compile('id\([a-z]\)\([a-z]\)')
    p.match('idab') ==> 4
    p.group(1, 2) ==> ('a', 'b')
    p.regs ==> ((0, 4), (2, 3), (3, 4), ...)
set_syntax(flag) -- Set syntax flags for future calls to match(), search() and compile(). Returns the previous value. Flags are defined in module regex_syntax.
symcomp(pattern [, translate]) -- Like compile but with symbolic group names. Names in angle brackets. Access through group method. E.g.
    p = symcomp('id\(<l1>[a-z]\)\(<l2>[a-z]\)')
    p.match('idab') ==> 4
    p.group('l1') ==> 'a'
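The return conventions described above can be approximated with the standard re module, as in the hedged sketch below. This is an emulation under stated assumptions, not this module's implementation: match_length and search_position are hypothetical helper names, grouping uses re's ( ) syntax, and symbolic names are written (?P<name>...). Note that in re the group and position data live on the match object returned by match(), not on the compiled pattern.

```python
import re

def match_length(pattern, string):
    """Number of characters matched at the start of string, or -1 if none."""
    m = re.match(pattern, string)
    return m.end() if m else -1

def search_position(pattern, string, pos=0):
    """Index of the first match at or after pos, or -1 if none."""
    m = re.compile(pattern).search(string, pos)
    return m.start() if m else -1

# The compile() example, rewritten with re-style grouping.
p = re.compile(r'id([a-z])([a-z])')
m = p.match('idab')
length = m.end()
groups = m.group(1, 2)
# Rebuild a regs-like tuple: whole match first, then each subexpression.
regs = tuple(m.span(i) for i in range(p.groups + 1))

# The symcomp() example, rewritten with re-style named groups.
q = re.compile(r'id(?P<l1>[a-z])(?P<l2>[a-z])')
named = q.match('idab').group('l1')
```

With these definitions, match_length('id', 'idab') gives 2 and search_position('z', 'idab') gives -1, mirroring the -1 convention of match() and search() above.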